Smart Paper Notes

Impact of loss functions on the performance of a deep neural network designed to restore low-dose digital mammography

Hongming Shan , Rodrigo de Barros Vimieiro , Lucas Rodrigues Borges , Marcelo Andrade da Costa Vieira , Ge Wang

Categories: Computer Vision

2021-11-12

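Digital mammography is still the most common imaging tool for breast cancer screening. Although the benefits of using digital mammography for cancer screening outweigh the risks associated with X-ray exposure, the radiation dose must be kept as low as possible while maintaining the diagnostic utility of the generated images, thus minimizing patient risk. Many studies have investigated the feasibility of dose reduction by restoring low-dose images with deep neural networks. In these cases, choosing an appropriate training database and loss function is crucial and affects the quality of the results. In this work, a modified ResNet architecture with hierarchical skip connections is proposed to restore low-dose digital mammography. We compare the restored images with standard full-dose images. Moreover, we evaluate the performance of several loss functions for this task. For training purposes, we extracted 256,000 image patches from a dataset of 400 images from retrospective clinical mammography exams, in which different dose levels were simulated to generate low-dose and standard-dose pairs. To validate the network in a real scenario, a physical anthropomorphic breast phantom was used to acquire real low-dose and standard full-dose images on a commercially available mammography system, which were then processed by our trained model. A previously presented analytical restoration model for low-dose digital mammography was used as the benchmark in this work. Objective assessment was performed through the signal-to-noise ratio (SNR) and the mean normalized squared error (MNSE), decomposed into residual noise and bias. Results show that the perceptual loss function (PL4) achieves virtually the same noise level as a full-dose acquisition, while resulting in a smaller signal bias than the other loss functions.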

translated by Google Translate
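The abstract above reports the error as split into residual noise and bias. The following is a minimal sketch of the generic identity behind such a split (mean squared error = squared bias + variance), evaluated at a single pixel over repeated acquisitions. The function name and arguments are hypothetical, and the normalization that MNSE applies is not given in the abstract, so it is left out here.

```python
import statistics


def error_decomposition(restorations, reference):
    """Split the mean squared error at one pixel into squared bias and variance.

    restorations: restored values of the same pixel over repeated noisy acquisitions
    reference:    the ground-truth value of that pixel (assumed known)
    """
    values = list(restorations)
    mean_value = statistics.fmean(values)
    # Mean squared error against the reference value.
    mse = statistics.fmean([(v - reference) ** 2 for v in values])
    # Systematic part: how far the average restoration is from the reference.
    bias_sq = (mean_value - reference) ** 2
    # Random part: spread of the restorations around their own mean.
    variance = statistics.pvariance(values, mu=mean_value)
    return mse, bias_sq, variance
```

With this split, a loss function that smooths aggressively would be expected to show low variance but a larger bias term, which is the kind of trade-off the abstract compares across loss functions.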

A Physics-Informed Neural Network to Model Port Channels

Marlon S. Mathias , Marcel R. de Barros , Jefferson F. Coelho , Lucas P. de Freitas , Felipe M. Moreno , Caio F. D. Netto , Fabio G. Cozman , Anna H. R. Costa , Eduardo A. Tannuri , Edson S. Gomi

Categories: Machine Learning

2022-12-20

We describe a Physics-Informed Neural Network (PINN) that simulates the flow induced by the astronomical tide in a synthetic port channel, with dimensions based on the Santos - São Vicente - Bertioga Estuarine System. PINN models aim to combine the knowledge of physical systems and data-driven machine learning models. This is done by training a neural network to minimize the residuals of the governing equations at sample points. In this work, our flow is governed by the Navier-Stokes equations with some approximations. There are two main novelties in this paper. First, we design our model to assume that the flow is periodic in time, which is not feasible in conventional simulation methods. Second, we evaluate the benefit of resampling the function evaluation points during training, which has a near zero computational cost and has been verified to improve the final model, especially for small batch sizes. Finally, we discuss some limitations of the approximations used in the Navier-Stokes equations regarding the modeling of turbulence and how it interacts with PINNs.
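The two ideas in this abstract, time periodicity and resampling of evaluation points, can be illustrated with a small sketch. It assumes one common way to obtain periodicity, namely feeding the network sine and cosine of the time phase instead of raw time; the abstract does not say how the authors enforce it. The function names, the one-dimensional domain, and the period value used in any call are hypothetical.

```python
import math
import random


def periodic_time_features(t, period):
    """Map time t to (sin, cos) of its phase.

    Any network that only sees these two features is periodic in t
    with the given period by construction.
    """
    phase = 2.0 * math.pi * t / period
    return math.sin(phase), math.cos(phase)


def resample_points(n, x_range, t_range, rng):
    """Draw a fresh batch of n (x, t) points where the PDE residual is evaluated.

    Calling this at every training step, instead of reusing one fixed set,
    is the resampling strategy described in the abstract.
    """
    return [(rng.uniform(*x_range), rng.uniform(*t_range)) for _ in range(n)]
```

Drawing a new batch costs only a few random numbers per step, which is consistent with the near-zero overhead the abstract mentions.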

Debiasing Methods for Fairer Neural Models in Vision and Language Research: A Survey

Otávio Parraga , Martin D. More , Christian M. Oliveira , Nathan S. Gavenski , Lucas S. Kupssinskü , Adilson Medronha , Luis V. Moura , Gabriel S. Simões , Rodrigo C. Barros

Categories: Machine Learning | Artificial Intelligence | Natural Language Processing | Computer Vision

2022-11-10

Despite being responsible for state-of-the-art results in several computer vision and natural language processing tasks, neural networks have faced harsh criticism due to some of their current shortcomings. One of them is that neural networks are correlation machines prone to model biases within the data instead of focusing on actual useful causal relationships. This problem is particularly serious in application domains affected by aspects such as race, gender, and age. To prevent models from making unfair decisions, the AI community has concentrated efforts on correcting algorithmic biases, giving rise to the research area now widely known as fairness in AI. In this survey paper, we provide an in-depth overview of the main debiasing methods for fairness-aware neural networks in the context of vision and language research. We propose a novel taxonomy to better organize the literature on debiasing methods for fairness, and we discuss the current challenges, trends, and important future work directions for the interested researcher and practitioner.

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Teven Le Scao , Angela Fan , Christopher Akiki , Ellie Pavlick , Suzana Ilić , Daniel Hesslow , Roman Castagné , Alexandra Sasha Luccioni , François Yvon , Matthias Gallé

Categories: Natural Language Processing

2022-11-09

Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.

MonoByte: A Pool of Monolingual Byte-level Language Models

Hugo Abonizio , Leandro Rodrigues de Souza , Roberto Lotufo , Rodrigo Nogueira

Categories: Natural Language Processing

2022-09-22

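The zero-shot cross-lingual ability of models pretrained on multilingual and even monolingual corpora has spurred many hypotheses to explain this intriguing empirical result. However, due to the cost of pretraining, most research uses public models whose pretraining methodology, such as the choice of tokenization, corpus size, and computational budget, may differ drastically. When researchers pretrain their own models, they often do so under a constrained budget, and the resulting models may significantly underperform SOTA models. These experimental differences have led to various inconsistent conclusions about the nature of the cross-lingual ability of these models. To help further research on the topic, we release 10 monolingual byte-level models, rigorously pretrained under the same configuration with a large compute budget (equivalent to 420 days on a V100) and on corpora 4 times larger than that of the original BERT. Because they are tokenizer-free, the problem of unseen token embeddings is eliminated, allowing researchers to try a wider range of cross-lingual experiments in languages with different scripts. Additionally, we release two models pretrained on non-natural language texts that can be used in sanity-check experiments. Experiments on QA and NLI tasks show that our monolingual models achieve performance competitive with the multilingual one, and hence can serve to strengthen our understanding of cross-lingual transferability in language models.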

Mapless Navigation of a Hybrid Aerial Underwater Vehicle with Deep Reinforcement Learning Through Environmental Generalization

Ricardo B. Grando , Junior C. de Jesus , Victor A. Kich , Alisson H. Kolling , Rodrigo S. Guerra , Paulo L. J. Drews-Jr

Categories: Robotics | Artificial Intelligence

2022-09-13

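Previous works showed that Deep-RL can be applied to mapless navigation, including the medium transition of Hybrid Unmanned Aerial Underwater Vehicles (HUAUVs). This paper presents new approaches based on state-of-the-art actor-critic algorithms to address the navigation and medium transition problems for a HUAUV. We show that a double-critic Deep-RL with recurrent neural networks improves the navigation performance of HUAUVs using only range data and relative localization. Our Deep-RL approaches achieve better navigation and transitioning capabilities, with a solid generalization of learning across distinct simulated scenarios, outperforming previous approaches.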

Deterministic and Stochastic Analysis of Deep Reinforcement Learning for Low Dimensional Sensing-based Navigation of Mobile Robots

Ricardo B. Grando , Junior C. de Jesus , Victor A. Kich , Alisson H. Kolling , Rodrigo S. Guerra , Paulo L. J. Drews-Jr

Categories: Robotics | Artificial Intelligence

2022-09-13

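Deterministic and stochastic techniques in deep reinforcement learning (Deep-RL) have become a promising solution for improving motion control and decision-making tasks for a wide variety of robots. Previous works showed that these Deep-RL algorithms can be applied to mapless navigation of mobile robots in general. However, they tend to use simple sensing strategies, since they have been shown to perform poorly with high-dimensional state spaces, such as those yielded by image-based sensing. This paper presents a comparative analysis of two Deep-RL techniques, Deep Deterministic Policy Gradients (DDPG) and Soft Actor-Critic (SAC), when performing mapless navigation tasks for mobile robots. We aim to contribute by showing how the neural network architecture influences the learning itself, presenting quantitative results based on the time and distance of navigation of aerial mobile robots for each approach. Overall, our analysis of six distinct architectures highlights that the stochastic approach (SAC) is better suited to deeper architectures, while the opposite holds for the deterministic approach (DDPG).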

Class-Aware Attention for Multimodal Trajectory Prediction

Bimsara Pathiraja , Shehan Munasinghe , Malshan Ranawella , Maleesha De Silva , Ranga Rodrigo , Peshala Jayasekara

Categories: Computer Vision

2022-08-31

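Predicting the future trajectories of surrounding dynamic agents is an essential requirement in autonomous driving. These trajectories mainly depend on the surrounding static environment as well as on the past movements of those dynamic agents. Furthermore, the multimodal nature of agent intentions makes the trajectory prediction problem more challenging. All existing models treat the target agent and the surrounding agents alike, without considering the variation in their physical properties. In this paper, we present a novel deep-learning-based framework for multimodal trajectory prediction in autonomous driving, which considers the physical properties of the target and surrounding vehicles, such as the object class and physical dimensions, through a weighted attention module, thereby improving the accuracy of the predictions. Our model achieves the highest results on the nuScenes trajectory prediction benchmark among the models that use rasterized maps to input environment information. Furthermore, our model is able to run in real time, achieving a high inference rate of 300 FPS.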

Study of General Robust Subband Adaptive Filtering

Yi Yu , Hongsen He , Rodrigo C. de Lamare , Badong Chen

Categories: Machine Learning

2022-08-04

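In this paper, we propose a general robust subband adaptive filtering (GR-SAF) scheme against impulsive noise by minimizing the mean-square deviation under the random-walk model with individual weight uncertainty. Specifically, by choosing different scaling factors in the GR-SAF scheme, such as those from the M-estimate and maximum correntropy robust criteria, we can easily obtain different GR-SAF algorithms. Importantly, the proposed GR-SAF algorithm can be reduced to a variable-regularization robust normalized SAF algorithm, thus having a fast convergence rate and low steady-state error. Simulations in the contexts of system identification with impulsive noise and echo cancellation with double-talk have verified that the proposed GR-SAF algorithms outperform their counterparts.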

Sequence-aware multimodal page classification of Brazilian legal documents

Pedro H. Luz de Araujo , Ana Paula G. S. de Almeida , Fabricio A. Braz , Nilton C. da Silva , Flavio de Barros Vidal , Teofilo E. de Campos

Categories: Natural Language Processing

2022-07-02

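The Brazilian Supreme Court receives tens of thousands of cases each semester. Court employees spend thousands of hours on the initial analysis and classification of those cases, which takes effort away from later, more complex stages of the case management workflow. In this paper, we explore multimodal classification of documents from Brazil's Supreme Court. We train and evaluate our methods on a novel multimodal dataset of 6,510 lawsuits (339,478 pages), with manual annotation assigning each page to one of six classes. Each lawsuit is an ordered sequence of pages, which are stored both as an image and as the corresponding text extracted through optical character recognition. We first train two unimodal classifiers: a ResNet pre-trained on ImageNet is fine-tuned on the images, and a convolutional network with filters of multiple kernel sizes is trained from scratch on the document texts. We use them as extractors of visual and textual features, which are then combined through our proposed Fusion Module. Our Fusion Module can handle missing textual or visual input by using learned embeddings for missing data. Moreover, we experiment with bidirectional Long Short-Term Memory (biLSTM) networks and linear-chain conditional random fields to model the sequential nature of the pages. The multimodal approaches outperform both the textual and the visual classifiers, especially when leveraging the sequential nature of the pages.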
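The handling of missing inputs described in this abstract can be sketched as follows. This is an illustrative assumption, not the authors' Fusion Module: the abstract does not say how the two feature vectors are combined, so plain concatenation is used here, and the function and parameter names are hypothetical. In a trained model, the two placeholder vectors would be parameters learned together with the rest of the network.

```python
def fuse_features(text_features, image_features, missing_text, missing_image):
    """Combine per-page textual and visual features into one vector.

    text_features / image_features: feature lists, or None when that
        modality is unavailable for the page.
    missing_text / missing_image: placeholder vectors standing in for the
        learned "missing data" embeddings (plain lists in this sketch).
    """
    # Substitute the placeholder when a modality is absent.
    text_part = missing_text if text_features is None else text_features
    image_part = missing_image if image_features is None else image_features
    # Concatenation is an assumption made for this sketch only.
    return list(text_part) + list(image_part)
```

Because the fused vector always has the same length, a downstream sequence model over the pages of a lawsuit can consume pages with and without extracted text in the same way.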
